REPETITIVE PLAY OF AN UNKNOWN GAME AGAINST NATURE

by Bruno O. Subert

ABSTRACT 


A repetitive play of a game against Nature is considered under the
assumption that the player knows nothing about the game except his own
set of strategies. After each play, he is told the value of the random
loss incurred by him. A strategic rule for the player is defined with
the property that the average loss achieves asymptotically the minimum
functional of the game in probability uniformly in all sequences of
Nature's strategies. The rate of convergence of expected average losses
is shown as well.
Chapter I
INTRODUCTION 


Let us consider a sequence of plays of a two-person game, the
generic game, where the first player is "Nature." The term "game against
Nature" is widely used in both game theory and statistics, but in various
senses. Nature is sometimes considered as a player with motivations
unspecified or unknown to the other player or sometimes as a player who
chooses his strategies so that they form a sequence of independent,
identically distributed random variables. Here, we will define Nature
as a player who selects his strategies arbitrarily but with no regard
whatsoever to the actions of the other player or to the resulting sequence
of payoffs. Besides the fact that this notion seems to correspond better
to the intuitive idea of Nature as a passive player, our definition is
motivated mainly by the application of the problem of repetitive play of
a two-person game--called henceforth a sequential game--to statistics.

Our  concept  of  Nature  implies  that  she  selects  the  whole  sequence  of  her 
strategies  arbitrarily  but  once  and  for  all  at  the  beginning  of  the 
sequence  of  plays,  which  is  exactly  the  case  considered  in  the  so-called 
sequential  compound  decision  problem  of  mathematical  statistics.  This 
concept,  which  includes  the  third  concept  mentioned  above  (called  the 
empirical  Bayes  approach  in  statistics),  differs,  however,  from  the  first 
two,  which  are,  in  any  case,  rather  vague. 

As for the second player--to be referred to as the player--he is
considered a player in the true sense of game theory; that is, he is
supposed to select his strategies with the intention of minimizing the
payoff to his opponent, i.e., his loss. Since we are going to consider
the whole situation from the point of view of the player, we will talk
about losses rather than payoffs. In the sequential game, it is assumed
that the player will utilize for his strategy choices any information
about the development of the sequence of plays he may have obtained or
inferred during the past plays in the sequence. Of course, he is not
supposed to know the sequence of Nature's strategies beforehand;
otherwise, his task would be trivial.*

* Provided he knows the loss function. If not, then this case is included
in the case we are considering in this paper.

Since we are dealing with the sequential game, the natural criterion
of the player's performance in the long run is the average loss incurred
by him. It has been indicated (see, e.g., [6], [4], or [9]) that the goal
he should try to achieve is reduction of his average loss to the minimum
loss of a single game of identical structure, in which Nature (the other
player) would use the mixed strategy equal to the empirical distribution,
up to that point, of the pure strategy sequence she is using in the
sequential game. The problem thus consists essentially in finding a rule
for the player that guarantees he will achieve this goal, at least
asymptotically, no matter what sequence of strategies Nature uses.

In  the  past  decade  or  two,  several  papers  have  dealt  with  this 
problem,  especially  its  application  to  statistics.  After  the  pioneering 
works  of  H.  Robbins  [6]  and  A.  Spacek  [8],  who  both  confined  themselves 
to  the  empirical  Bayes  problem,  two  basic  strategic  rules  with  the  desired 
property  were  stated  by  D.  Blackwell  in  [2],  [3],  and  by  J.  Hannan  in  [4]. 
All  the  other  rules  suggested  later  were  derived  essentially  from  one  of 
these  two  basic  rules. 

Various  approaches  to  the  problem  may  be  classified  according  to 
the  assumptions  made  about  the  information  available  to  the  player. 

This,  in  turn,  may  be  divided  into  assumptions  about 

(1)  the  knowledge  of  the  generic  data  of  the  game,  i.e.,  the  sets 
of  strategies  and  the  loss  function,  and 

(2)  the  data  received  during  the  sequence  of  plays,  i.e.,  for 
example,  the  strategies  used  by  Nature  in  past  plays  of  the 
sequence  or  the  losses  incurred. 

Both D. Blackwell and J. Hannan assumed that (1) the player has
complete knowledge of the generic game and that (2) after each play, he
learns the strategy Nature used. In the statistical version of the
sequential game, the variety of possible assumptions in either of
categories 1 and 2 is, of course, much wider. Nevertheless, as far as is
known to the author, it has always been assumed that at least (1) the
player knows the loss function of the generic game and (2) a random
estimate of some sort is available to estimate the empirical distribution
of Nature's strategies.

In this paper, we are going to make entirely different assumptions
about the player's information. We will assume that:

(1)  Except  for  his  own  set  of  strategies  from  which  to  choose,  the 
player  knows  nothing  about  the  generic  game;  i.e.,  he  knows 
neither  the  set  of  Nature's  strategies  (which  may  be  infinite) 
nor  the  loss  function.  Moreover,  the  loss  function  itself  is 
supposed  to  be  random. 

(2)  After  each  play,  the  player  is  told  the  value  of  the  random 
loss  incurred  by  him  in  the  play. 

We will define a rule for the player based on these two requirements.
Later we will show that, under relatively moderate assumptions--namely,
that the set of the player's strategies is finite and the random loss
function has a nonnegative mean and a uniformly bounded third moment--the
rule possesses an optimality property similar to those of Blackwell's and
Hannan's rules. More precisely, we will show that the difference between
the average loss and the appropriate value of the minimum functional
of the generic game goes to zero in probability uniformly in all sequences
of Nature's strategies.

To illustrate the extent to which the player's information is
restricted, let us consider the following simple example. Suppose that
Nature and the player each have two strategies, say
$\vartheta^{(0)}, \vartheta^{(1)}$ and $a^{(0)}, a^{(1)}$, respectively.
The player is asked to play repeatedly one of the two games:

                                       PLAYER
                                $a^{(0)}$   $a^{(1)}$
    NATURE  $\vartheta^{(0)}$       0           X
            $\vartheta^{(1)}$       X           0

                                GAME 1

                                       PLAYER
                                $a^{(0)}$   $a^{(1)}$
    NATURE  $\vartheta^{(0)}$       X           0
            $\vartheta^{(1)}$       X           0

                                GAME 2

where the entry 0 means that the player's loss is zero, while, if the
entry is X, a coin is tossed and the player incurs a loss of, say, one
dollar if the outcome of the toss is a head, and zero if it is a tail.
Clearly, the best rule for game 2 would be to use the strategy
$a^{(1)}$ all the time.
On the other hand, the best rule for game 1 will depend on the relative
frequencies of $\vartheta^{(0)}$ and $\vartheta^{(1)}$ in the sequence of
Nature's strategies and is, therefore, different from the first one.
However, by our assumptions, the player knows nothing about the game but
the set $\{a^{(0)}, a^{(1)}\}$ and therefore he cannot distinguish between
game 1 and game 2. Thus he cannot decide which of the two rules mentioned
he should use, even if he were supplied some information about the
sequence Nature is going to use. The rule defined in Chapter III of this
paper is, however, invariant with respect to the game structure and allows
the player to do as well as if he were told both the relative frequencies
of $\vartheta^{(0)}$ and $\vartheta^{(1)}$ and which of the two games he
has to play.

The rule is relatively simple and more or less suggested by intuition.
Before each play in the sequence, the player decides whether the play is
going to be a test play (aimed at gaining information) or an active play
(aimed at minimizing the loss). These decisions are based on the outcomes
of random experiments--independent flips of a coin where the probability
of a head (determining a test play) goes to zero. At a test play, a
strategy is chosen randomly with equal probabilities. At an active play,
the strategy is selected for which the loss accumulated during the past
test plays was minimum. In other words, each strategy is tested from
time to time, more and more infrequently but still often enough to
guarantee the adequacy of the estimate obtained for the player's
decisions.
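
The procedure just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the formal
rule: the schedule p_n = n^(-1/3), the names `simulate`, `game1`, `game2`,
`nature`, and the fixed seed are hypothetical choices, while the weight
m/p_n and the added constant 1 imitate the vectors Y_n of Chapter III.

```python
import random

def coin(rng):
    """Loss of entry X: one unit on heads, zero on tails."""
    return 1.0 if rng.random() < 0.5 else 0.0

def game1(theta, a, rng):
    """Game 1 of the example: a coin-toss loss when a differs from theta."""
    return coin(rng) if a != theta else 0.0

def game2(theta, a, rng):
    """Game 2 of the example: a coin-toss loss whenever strategy 0 is used."""
    return coin(rng) if a == 0 else 0.0

def nature(n):
    """A fixed sequence chosen in advance: theta = 1 in three plays out of four."""
    return 0 if n % 4 == 0 else 1

def simulate(loss, nature, n_plays, m=2, seed=0):
    """Run the test/active procedure and return the player's average loss."""
    rng = random.Random(seed)
    estimate = [0.0] * m                 # reweighted losses from test plays
    total = 0.0
    for n in range(1, n_plays + 1):
        p = n ** (-1.0 / 3.0)            # assumed test probability, tends to 0
        if rng.random() < p:             # test play: uniform random strategy
            a = rng.randrange(m)
            w = loss(nature(n), a, rng)
            estimate[a] += (m / p) * (1.0 + w)
        else:                            # active play: smallest estimate so far
            a = min(range(m), key=lambda i: estimate[i])
            w = loss(nature(n), a, rng)
        total += w
    return total / n_plays
```

The player code never inspects the loss table or Nature's strategy; it
only sees the realized loss of each play, which is exactly the
information assumed in (1) and (2) above.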

The rule is defined in Chapter III. Chapter II introduces the
notation, basic assumptions, and properties of the generic game. In
Chapter IV several lemmas are proven; these are needed for the proofs
of Chapter V. Chapter V contains the main theorem (Theorem 1), in which
the convergence of average losses is established; Theorems 2 and 3, which
give the rate of convergence of expected average losses; and Theorems 4
and 5, which deal with the special case when Nature's moves constitute a
sequence of independent identically distributed random variables.
Discussion of the results and comments on possible generalizations are
contained in Chapter VI.

In this paper, we confine ourselves deliberately to the case of
sequential games against Nature and do not extend the results to the more
general case of the sequential compound decision problems. Our intention
is to investigate the game situation in detail and thus to establish a
basis for further extension to the statistical decision case. It has
been shown by the author [10] that this can be done. It is hoped that
this work may stimulate further effort in this direction.
Chapter II
PREREQUISITES 


A.  Notation 

Throughout this paper, the symbol $(\Omega, \mathcal{A}, P)$ will always
denote the basic probability space, where $\Omega$ with generic elements
$\omega$ is the set of elementary events, $\mathcal{A}$ is a
$\sigma$-field of subsets of $\Omega$, and $P$ is a probability measure on
$(\Omega, \mathcal{A})$. Random variables will be designated by capital
letters, with the argument $\omega$ omitted unless necessary. Sets from
$\mathcal{A}$ defined as sets of those $\omega \in \Omega$ for which a
statement $\mathcal{S}$ is true will be denoted by $\{\mathcal{S}\}$.
The indicator function of a set $M$ will be denoted by $I_M$; the
complement of a set $M \subset \Omega$, by $M^c$. The expectation of a
random variable $X$ will be denoted by $E\{X\}$; conditional expectation
of $X$ given the $\sigma$-field $\mathcal{F} \subset \mathcal{A}$ induced
by a random variable $Y$ will be denoted either by $E\{X|Y\}$ or by
$E\{X|\mathcal{F}\}$ and used in the sense of the definition in [5],
p. 341; similarly, for conditional probability. Other symbols and/or
definitions will be used in accordance with [5].

As for other mathematical symbols, a real-valued function f will
sometimes be written as f(.) to distinguish it from its value f(x) for
the argument x. If f is defined on a finite ordered set A, we will also
use the symbol f(.) to denote the Euclidean vector with components f(a),
$a \in A$. The symbol $R^m$ will stand for m-dimensional Euclidean space;
the components of a vector $x \in R^m$ will be denoted by superscripts in
parentheses: $x = (x^{(1)}, \ldots, x^{(m)})$. The inner product of two
vectors $x \in R^m$ and $y \in R^m$ will be denoted by $x \cdot y$. If
$\{x_n: n = 1,2,\ldots\}$ is a numerical sequence, the symbol
$\langle x_n \rangle$ will denote the arithmetic mean

    $\langle x_n \rangle = n^{-1} \sum_{k=1}^{n} x_k .$

If $\{y_n > 0: n = 1,2,\ldots\}$ is another sequence of real numbers, the
symbol $x_n = O(y_n)$ will designate the property

    $\limsup_{n \to +\infty} |x_n| / y_n < +\infty .$



B.  The  Generic  Game 

As  mentioned  in  the  Introduction,  a  sequential  game  consists  of 
repetitive  plays  of  a  generic  game.  In  this  section,  we  will  define 
and  investigate  some  properties  of  the  latter.  Since  our  model  of  a 
sequential  game  should  serve  primarily  as  a  basis  for  the  sequential 
compound  decision  problem,  the  terminology  and  symbols  introduced  below 
differ  slightly  from  those  common  in  game  theory  proper. 


Let $\Theta$ be an abstract set--the set of parameters $\vartheta$; let
$A = \{a^{(1)}, \ldots, a^{(m)}\}$--the set of strategies--be a finite set
of m elements. Let
$\mathcal{W} = \{W(\vartheta,a): \vartheta \in \Theta, a \in A\}$ be a
two-parameter family of integrable random variables such that for every
$W \in \mathcal{W}$

    $0 \le E\{W\} < +\infty .$    (2.1)

We will call $\mathcal{W}$ the random loss function. The triplet
$(\Theta, A, \mathcal{W})$ will be referred to as a generic game.

In this paper, we will always assume that the generic game
$(\Theta, A, \mathcal{W})$ is nondegenerate in the sense that for every
$a \in A$ there exists $\vartheta \in \Theta$ such that

    $E\{W(\vartheta,a)\} \ne 0 .$    (2.2)

Clearly, there is hardly any loss of generality in this assumption since
we can always augment the set $\Theta$ and the family $\mathcal{W}$ so
that (2.2) is true.

In subsequent sections, we will be making some further assumptions
concerning integrability of the family $\mathcal{W}$. For reference
purposes, let us call

Assumption ($\mathcal{W}$; r); $r \ge 1$ an integer: There exists a finite
constant $C_0$ such that for every $\vartheta \in \Theta$ and $a \in A$

    $[E|W(\vartheta,a)|^r]^{1/r} \le C_0 .$

Assumption ($\mathcal{W}$; $\infty$): There exists a finite constant $C_0$
such that for every $\vartheta \in \Theta$ and $a \in A$

    $|W(\vartheta,a)| \le C_0$  a.s.


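Next, we have to introduce mixed strategies. The set A being
finite, a mixed strategy $\alpha$ is simply defined as a vector
$\alpha = (\alpha^{(1)}, \ldots, \alpha^{(m)})$ from the
(m-1)-dimensional probability simplex $\mathcal{Q}$ in $R^m$,

    $\mathcal{Q} = \{\alpha \in R^m: \alpha^{(i)} \ge 0; \; i = 1,\ldots,m; \; \sum_{i=1}^{m} \alpha^{(i)} = 1\} .$    (2.3)
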


We will need a similar concept defined for the set of parameters $\Theta$.
Let $\mathcal{F}$ be the $\sigma$-field of all subsets of $\Theta$; let
$\mathcal{T}$ be the class of all finite signed measures on the measurable
space $(\Theta, \mathcal{F})$ defined by the property: for each
$\tau \in \mathcal{T}$ there exists a finite set
$\{\vartheta_1, \ldots, \vartheta_n\} \subset \Theta$ such that

    $\tau(\Theta - \{\vartheta_1, \ldots, \vartheta_n\}) = 0 .$

In other words, all the measures $\tau \in \mathcal{T}$ are purely atomic
with a finite number of atoms. As an analog of the set of mixed strategies
$\mathcal{Q}$ will serve the subclass $\mathcal{T}_0$ of all probability
measures in $\mathcal{T}$. If $\tau \in \mathcal{T}_0$ is such that
$\tau(\{\vartheta\}) = 1$ for some $\vartheta \in \Theta$, we may write
simply $\vartheta$ instead of $\tau$.

From now on, let us denote

    $w(\vartheta,a) = E\{W(\vartheta,a)\} ,$    (2.4)

where $W(\vartheta,a) \in \mathcal{W}$, and let for every
$\tau \in \mathcal{T}$, $\alpha \in \mathcal{Q}$,

    $w(\tau,\alpha) = \int_{\Theta} \sum_{i=1}^{m} \alpha^{(i)} w(\vartheta,a^{(i)}) \, d\tau(\vartheta) .$    (2.5)

Since, by the definition of the class $\mathcal{T}$, the integral in (2.5)
is only a finite sum, w(.,.) is a well defined finite function on
$\mathcal{T} \times \mathcal{Q}$. Moreover, it is easily seen that with
addition and multiplication by a constant naturally defined on both
$\mathcal{T}$ and $\mathcal{Q}$, w(.,.) is also a bilinear functional on
$\mathcal{T} \times \mathcal{Q}$.

Let $(\Theta, A, \mathcal{W})$ be a generic game with a random loss
function; let for every $\tau \in \mathcal{T}$,

    $\varphi(\tau) = \min_{a \in A} w(\tau,a) .$    (2.6)

We will call $\varphi$ the minimum functional of the generic game
$(\Theta, A, \mathcal{W})$.
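
As an illustration of (2.5) and (2.6), the following sketch evaluates the
mean loss and the minimum functional for a purely atomic measure. The
representation is an assumption made for the example: a measure is a
dictionary mapping each atom to its mass, and the hypothetical table
`W_GAME1` holds the mean losses of game 1 of the Introduction, where an
entry X has mean 1/2.

```python
def w_mixed(tau, alpha, w):
    """Mean loss w(tau, alpha) of (2.5): a finite sum over the atoms of tau."""
    return sum(mass * sum(alpha[i] * w[theta][i] for i in range(len(alpha)))
               for theta, mass in tau.items())

def phi(tau, w, m):
    """Minimum functional (2.6): smallest mean loss over the m pure strategies."""
    pure = [[1.0 if j == i else 0.0 for j in range(m)] for i in range(m)]
    return min(w_mixed(tau, e, w) for e in pure)

# Mean losses of game 1: rows indexed by Nature's strategy, columns by a.
W_GAME1 = {0: [0.0, 0.5], 1: [0.5, 0.0]}
```

For instance, if Nature uses her second strategy three times out of four,
`phi({0: 0.25, 1: 0.75}, W_GAME1, 2)` is the mean loss of always answering
with the second strategy.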

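The minimum functional has a simple property. Let
$s^* = (s^{*(1)}, \ldots, s^{*(m)})$ be a mapping from $R^m$ into
$\mathcal{Q}$ defined for every $x = (x^{(1)}, \ldots, x^{(m)}) \in R^m$
by

    $s^{*(i)}(x) = 1$                          if $x^{(i)} < \min_{j \ne i} x^{(j)}$,
    $s^{*(i)}(x) =$ [value illegible in the source]  if $x^{(i)} = \min_{j=1,\ldots,m} x^{(j)}$,
    $s^{*(i)}(x) = 0$                          if $x^{(i)} > \min_{j=1,\ldots,m} x^{(j)}$,    (2.7)

i = 1,...,m. Then, clearly, for every $\tau \in \mathcal{T}$,

    $\varphi(\tau) = w(\tau, s^*(w(\tau,.))) .$    (2.8)
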

We will use this property to prove the following lemma to be needed
later.

Lemma 1. Let $\tau \in \mathcal{T}$, $x \in R^m$,
$\alpha_1 = s^*(w(\tau,.) + x)$, $\alpha_2 \in \mathcal{Q}$. Then

    $w(\tau,\alpha_1) - w(\tau,\alpha_2) \le x \cdot (\alpha_2 - \alpha_1) .$    (2.9)

Proof.
Let $\tau_x \in \mathcal{T}$ be such that $w(\tau_x,.) = x$. It is easy to
see that such $\tau_x$ exists for every $x \in R^m$ since, by assumption
(2.2), we can always find $\vartheta_1, \ldots, \vartheta_m \in \Theta$
such that $w(\vartheta_i,a^{(i)}) \ne 0$ for every i = 1,...,m and then
set $\tau_x(\{\vartheta_i\}) = x^{(i)} [w(\vartheta_i,a^{(i)})]^{-1}$;
i = 1,...,m; and
$\tau_x(\Theta - \{\vartheta_1, \ldots, \vartheta_m\}) = 0$.

Next, by (2.6) and (2.7)

    $\varphi(\tau + \tau_x) = w(\tau + \tau_x, \alpha_1) \le w(\tau + \tau_x, \alpha_2) ,$    (2.10)

which implies

    $0 \le w(\tau + \tau_x, \alpha_2) - w(\tau + \tau_x, \alpha_1) .$    (2.11)

Finally, linearity of $w(.,\alpha)$ gives

    $w(\tau + \tau_x, \alpha_\ell) = w(\tau,\alpha_\ell) + x \cdot \alpha_\ell ; \quad \ell = 1,2 ;$    (2.12)

which together with (2.11) gives (2.9).

This completes the Proof of Lemma 1.
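
Inequality (2.9) can be checked numerically on a small grid. The sketch
below is only a sanity check under assumptions: the vector `w_tau` stands
for w(tau,.), the function names are hypothetical, and ties in the minimum
are broken by taking the first index, which is one admissible choice of a
minimizing vertex rather than the tie rule of (2.7).

```python
from itertools import product

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def argmin_vertex(x):
    """Vertex of the simplex at a smallest coordinate of x (first index on ties)."""
    i = min(range(len(x)), key=lambda k: x[k])
    return [1.0 if k == i else 0.0 for k in range(len(x))]

def lemma1_slack(w_tau, x, alpha2):
    """Right side minus left side of (2.9); Lemma 1 asserts it is nonnegative."""
    alpha1 = argmin_vertex([a + b for a, b in zip(w_tau, x)])
    lhs = dot(w_tau, alpha1) - dot(w_tau, alpha2)
    rhs = dot(x, [b - a for a, b in zip(alpha1, alpha2)])
    return rhs - lhs

def worst_slack(w_tau, grid, alphas):
    """Smallest slack over all perturbations x on the grid and all given alpha2."""
    return min(lemma1_slack(w_tau, list(x), a)
               for x in product(grid, repeat=len(w_tau))
               for a in alphas)
```

The slack equals the excess of w(tau,.) + x evaluated at alpha2 over its
minimum, which is why it cannot be negative.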

C.  The  Sequential  Game 

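In the sequential game against Nature, Nature first selects a
sequence of parameters $\vartheta_1, \vartheta_2, \ldots$ from parameter
space $\Theta$. For further purposes, we will assume that the sequence
always begins with a "dummy" parameter $\vartheta_0$ such that

    $W(\vartheta_0,.) = w(\vartheta_0,.) = 0 .$    (2.13)

We will denote the set of all sequences
$\underline{\vartheta} = \{\vartheta_n: n = 0,1,\ldots\}$ by
$\Theta^\infty$, this representing the set of all Nature's strategies in
the sequential game. With every sequence
$\underline{\vartheta} = \{\vartheta_n: n = 0,1,\ldots\} \in \Theta^\infty$,
we will associate a sequence $\{\tau_n \in \mathcal{T}: n = 0,1,\ldots\}$
defined by

    $\tau_n(B) = n^{-1} \sum_{k=1}^{n} I_B(\vartheta_k) ; \; B \in \mathcal{F} ; \; n = 1,2,\ldots ; \quad \tau_0(.) = 0 .$    (2.14)

Thus, for n = 1,2,..., $\tau_n \in \mathcal{T}_0$ is the empirical
distribution of $\underline{\vartheta}$ defined by
$\{\vartheta_1, \ldots, \vartheta_n\}$, and $\tau_n(B)$ is the proportion
of $\vartheta_k$'s, k = 1,...,n, in B.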
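
A minimal sketch of (2.14), assuming Nature's parameters can be used as
dictionary keys; the function name `empirical_distribution` is
hypothetical, and the caller is expected to pass only
$\vartheta_1, \ldots, \vartheta_n$, that is, to leave out the dummy
parameter.

```python
from collections import Counter

def empirical_distribution(thetas):
    """Masses of the atoms of tau_n for the list theta_1, ..., theta_n."""
    n = len(thetas)
    if n == 0:
        return {}                       # tau_0 is the zero measure
    counts = Counter(thetas)
    return {theta: c / n for theta, c in counts.items()}
```

The returned dictionary gives the mass of each single parameter; the mass
of a set B is the sum of the masses of the atoms that lie in B.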
Let $\{a_n: n = 1,2,\ldots\}$ be a sequence of pure strategies used by the
player. The result of the sequential game is then a sequence of random
losses defined as a sequence of independent random variables

    $\{W_n(\vartheta_n,a_n): n = 1,2,\ldots\}$    (2.15)

where the random variable $W_n(\vartheta_n,a_n)$ is distributed as
$W(\vartheta,a) \in \mathcal{W}$ for $\vartheta = \vartheta_n$,
$a = a_n$.

As mentioned before, the player selects his mixed strategies at each
play on the basis of his past experience, using for his choices a
strategic rule, which tells him for each n = 1,2,... the mixed strategy
$S_n$ he has to use at the n-th play. Since each $S_n$ thus depends on the
outcomes of the past plays and since the strategic rule itself may be a
randomized rule, the sequence $\{S_n: n = 1,2,\ldots\}$--to be called the
sequence of mixed strategies generated by the rule--is a sequence of
random vectors with values in $\mathcal{Q}$.

A strategic rule also generates a sequence of pure strategies
$\{A_n: n = 1,2,\ldots\}$ defined as a sequence of random variables with
values in A such that if $\mathcal{F}_n$ denotes the $\sigma$-field
induced by the family
$\{S_1, \ldots, S_n; W_1(\vartheta_1,.), \ldots, W_n(\vartheta_n,.)\}$
then for every n = 1,2,...; $\underline{\vartheta} \in \Theta^\infty$,
i = 1,...,m

    [relation (2.16) is illegible in the source]    (2.16)

and

    $A_n$ and $W_n(\vartheta_n,.)$ are independent.    (2.17)

Chapter III
THE STRATEGIC RULE

In this chapter, we will define a strategic rule satisfying the
assumptions we made about the information available to the player.

Later (Chapter V) we will show that this rule is weakly asymptotically
optimal in the sense that if $\{A_n: n = 1,2,\ldots\}$ is the sequence of
pure strategies generated by it, then

    $n^{-1} \sum_{k=1}^{n} W_k(\vartheta_k,A_k) - \varphi(\tau_n) \xrightarrow{P} 0$

uniformly in $\underline{\vartheta} \in \Theta^\infty$. This means that,
by using this rule, the player can do as well as if he knew all the data
about the game structure and if he were told the asymptotic empirical
distribution of the $\vartheta$'s in the sequence Nature is going to use.
Moreover, he can do this uniformly in all Nature's possible choices. More
precisely, given $\epsilon > 0$ and $\delta > 0$, there exists a positive
integer $n(\epsilon,\delta)$ with the property that, if the number of
plays exceeds $n(\epsilon,\delta)$, the average loss will differ from the
goal $\varphi$ by more than $\epsilon$ with probability less than
$\delta$, no matter what sequence of $\vartheta$'s was used by Nature. The
integer $n(\epsilon,\delta)$ can be obtained from Theorem 2 or Theorem 3
of Chapter V, which give the rate of convergence.

Let

    $\{U_n: n = 0,1,\ldots\}$    (3.1)

be a sequence of independent random variables, taking values 0 and 1,
and let for every n = 0,1,...

    $P\{U_n = 1\} = p_n > 0 .$    (3.2)

Further, let

    $\{V_n: n = 0,1,\ldots\}$    (3.3)

be a sequence of independent identically distributed random vectors
taking values in the set $\mathcal{Q}$ and such that for every
i = 1,...,m,

    $P\{V_n^{(j)} = \delta_{ij}; \; j = 1,\ldots,m\} = m^{-1} ,$    (3.4)

where $\delta_{ij}$ is the Kronecker delta.

The random variable $U_n$ determines whether the n-th play will be a
test play ($U_n = 1$) or an active play ($U_n = 0$), while $V_n$
determines the strategy to be used in a test play.

The sequences (3.1) and (3.3), as well as the sequence (2.15), are
also assumed to be mutually independent (for every
$\underline{\vartheta} \in \Theta^\infty$).

Next, let

    $\{Y_n: n = 0,1,\ldots\}$    (3.5)

be a sequence of random vectors with values in $R^m$ defined for every
$\underline{\vartheta} \in \Theta^\infty$; n = 0,1,... by

    $Y_n^{(i)} = m p_n^{-1} U_n V_n^{(i)} [\eta^{(i)} + W_n(\vartheta_n,a^{(i)})] ; \quad i = 1,\ldots,m ,$    (3.6)

where $\eta = (\eta^{(1)}, \ldots, \eta^{(m)})$ is the vector with all
components = 1.

Thus the vector $Y_n$ has either all components zero (if $U_n = 0$) or has
only one nonzero component, namely that one for which $V_n^{(i)} = 1$, the
component being then equal to $m p_n^{-1} (1 + W_n(\vartheta_n,a^{(i)}))$.

The sequence (3.5) therefore represents the information the player
is receiving along the sequence of test plays, namely, the pure strategy
used and the loss incurred.
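
The factor $m p_n^{-1}$ compensates for the fact that a given strategy is
tested only with probability $p_n / m$. The following sketch checks this
by exact enumeration; it assumes, for illustration only, that each loss
takes finitely many values, and the names `expected_Y` and `loss_dists`
are hypothetical.

```python
from fractions import Fraction

def expected_Y(m, p, loss_dists):
    """Exact expectation of each component of Y_n.

    p is the test probability as a Fraction; loss_dists[j] is a list of
    (value, probability) pairs describing the loss under strategy j.
    """
    e = [Fraction(0)] * m
    for u, pu in ((1, p), (0, 1 - p)):        # test play or active play
        for j in range(m):                    # V_n is the j-th unit vector
            for v, q in loss_dists[j]:        # loss of the strategy played
                prob = pu * Fraction(1, m) * q
                y_j = Fraction(m) / p * u * (1 + v)   # only component j is nonzero
                e[j] += prob * y_j
    return e
```

With two strategies, a coin-toss loss for the first and zero loss for the
second, every component has expectation one plus the mean loss, whatever
the value of p.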

The strategic rule we are suggesting is defined by means of the
sequence of mixed strategies $\{S_n: n = 1,2,\ldots\}$ generated by it as
follows:

    $S_n = U_n V_n + (1 - U_n) \, s^*\left(\sum_{k=0}^{n-1} Y_k\right) ,$    (*)

where $s^*$ is the mapping (2.7).

Notice that

    $S_n = s^*\left(\sum_{k=0}^{n-1} Y_k\right)$

if the n-th play is an active one, thus selecting the strategy
$a^{(i)} \in A$ for which

    $\sum_{k=0}^{n-1} Y_k^{(i)}$

is minimum, and in a test play $S_n = V_n$ selects any $a \in A$ with
equal likelihood.

We will refer to this strategic rule as the strategic rule (*).
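
A single step of the rule (*) can be sketched as follows. The names
`s_star` and `mixed_strategy` are hypothetical, and splitting ties equally
among the minimizing coordinates is an assumption made for the example; it
is one rule that keeps the value inside the simplex.

```python
def s_star(x):
    """Mass on the smallest coordinate of x; ties are split equally (assumed)."""
    lo = min(x)
    winners = [i for i, v in enumerate(x) if v == lo]
    return [1.0 / len(winners) if i in winners else 0.0 for i in range(len(x))]

def mixed_strategy(u, v, y_sum):
    """S_n = u * v + (1 - u) * s_star(y_sum).

    u is 1 for a test play and 0 for an active play, v is a unit vector,
    and y_sum is the vector of sums of Y_k over the past plays.
    """
    s = s_star(y_sum)
    return [u * v[i] + (1 - u) * s[i] for i in range(len(v))]
```

In a test play the accumulated sums are ignored, and in an active play the
random vector v is ignored, which mirrors the two cases noted above.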
Chapter IV

SOME  AUXILIARY  RESULTS 

In  this  chapter,  we  are  going  to  prove  several  lemmas,  which  will 
be  used  in  the  next  chapter. 

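Let $\underline{\vartheta} = \{\vartheta_n: n = 0,1,\ldots\} \in \Theta^\infty$
be a sequence of $\vartheta$'s; let $\{Y_n: n = 0,1,\ldots\}$ be the
sequence of random vectors (3.5). By the assumption of integrability of
the random loss function

    $E\{Y_n^{(i)}\} = 1 + w(\vartheta_n,a^{(i)}) ; \quad i = 1,\ldots,m$    (4.1)

exists and is finite so that we can center the random vectors $Y_n$ at
expectations:

    $\tilde{Y}_n = (\tilde{Y}_n^{(1)}, \ldots, \tilde{Y}_n^{(m)}) ; \quad \tilde{Y}_n^{(i)} = Y_n^{(i)} - E\{Y_n^{(i)}\} ; \quad i = 1,\ldots,m .$    (4.2)

We will need bounds for absolute moments of the random variables
$\tilde{Y}_n^{(i)}$. Let us assume that the assumption
($\mathcal{W}$; r) holds for some $r \ge 3$, and let $1 \le r' \le r$.
We have

    $E|\tilde{Y}_n^{(i)}|^{r'} = E|Y_n^{(i)} - [1 + w(\vartheta_n,a^{(i)})]|^{r'}$
    $\le 2^{r'-1} [ m^{r'-1} p_n^{1-r'} E|1 + W_n(\vartheta_n,a^{(i)})|^{r'} + |1 + w(\vartheta_n,a^{(i)})|^{r'} ]$
    $\le 2^{r'-1} m^{r'-1} p_n^{1-r'} [ E|1 + W_n(\vartheta_n,a^{(i)})|^{r'} + |1 + w(\vartheta_n,a^{(i)})|^{r'} ]$    (4.3)

by the $c_r$-inequality ([5], p. 155) and the fact that

    $m^{1-r'} p_n^{r'-1} \le 1 \le m^{r'-1} p_n^{1-r'} .$

Further, by ($\mathcal{W}$; r) there exists a finite constant $C_1$ such
that, say

    $E|1 + W_n(\vartheta,a)|^r \le C_1^r$  for every $\vartheta \in \Theta$, $a \in A$ ,    (4.4)

and consequently by Jensen Inequality also

    $|1 + w(\vartheta,a)| \le C_1$  for every $\vartheta \in \Theta$, $a \in A$ .    (4.5)

Hence and from (4.3) we have immediately

    $E|\tilde{Y}_n^{(i)}|^{r'} \le (2C_1)^{r'} m^{r'-1} p_n^{1-r'} .$    (4.6)

Next, let us introduce the random variables
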


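    $\tilde{Y}_n^{(ij)} = \tilde{Y}_n^{(i)} - \tilde{Y}_n^{(j)} ; \quad i = 1,\ldots,m ; \; j = 1,\ldots,m ; \; n = 0,1,\ldots .$    (4.7)

By the $c_r$-inequality and (4.6) we have now

    $E|\tilde{Y}_n^{(ij)}|^{r'} \le (4C_1)^{r'} m^{r'-1} p_n^{1-r'} .$    (4.8)

We will need also a lower bound for the variance of $\tilde{Y}_n^{(ij)}$,
$i \ne j$. Direct computation gives

    $E|\tilde{Y}_n^{(ij)}|^2 = m p_n^{-1} [ E|1 + W_n(\vartheta_n,a^{(i)})|^2 + E|1 + W_n(\vartheta_n,a^{(j)})|^2 ] - |\Delta_n^{(ij)}|^2 ,$    (4.9)

where we denoted

    $\Delta_n^{(ij)} = w(\vartheta_n,a^{(i)}) - w(\vartheta_n,a^{(j)}) .$    (4.10)

Using again Jensen Inequality

    $E|1 + W_n(\vartheta_n,a)|^2 \ge |1 + w(\vartheta_n,a)|^2 ,$    (4.11)

we obtain from (4.9)

    $E|\tilde{Y}_n^{(ij)}|^2 \ge 2 m p_n^{-1} [1 + w(\vartheta_n,a^{(i)}) + w(\vartheta_n,a^{(j)})]$
    $\quad + (m p_n^{-1} - 1) [w^2(\vartheta_n,a^{(i)}) + w^2(\vartheta_n,a^{(j)})] + 2 w(\vartheta_n,a^{(i)}) w(\vartheta_n,a^{(j)})$
    $\ge 2 m p_n^{-1} + (m p_n^{-1} - 1) [w^2(\vartheta_n,a^{(i)}) + w^2(\vartheta_n,a^{(j)})]$    (4.12)

since w(.,.) is nonnegative [assumption (2.1)]. Hence, if
$p_n \le m^{-1}$, we have

    $E|\tilde{Y}_n^{(ij)}|^2 \ge 2 m p_n^{-1} .$    (4.13)
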

Thus we have proven

Lemma 2. Let $\tilde{Y}_n^{(i)}$ and $\tilde{Y}_n^{(ij)}$; i = 1,...,m;
j = 1,...,m; be the random variables (4.2) and (4.7), respectively; let
assumption ($\mathcal{W}$; r) for some $r \ge 3$, be satisfied. Then there
is a constant $C_1 < +\infty$ such that for every n = 0,1,... and
$1 \le r' \le r$

    $E|\tilde{Y}_n^{(i)}|^{r'} \le (2C_1)^{r'} m^{r'-1} p_n^{1-r'} ,$    (4.14)

    $E|\tilde{Y}_n^{(ij)}|^{r'} \le (4C_1)^{r'} m^{r'-1} p_n^{1-r'} ,$    (4.15)

and, if $p_n \le m^{-1}$ and $i \ne j$,

    $E|\tilde{Y}_n^{(ij)}|^2 \ge 2 m p_n^{-1} .$    (4.16)


Since the random variables $\tilde{Y}_n^{(ij)}$; n = 0,1,...; are
independent, their consecutive sums have another property expressed by

Lemma 3. Let

    $Z_n^{(ij)} = n^{-1/2} \sum_{k=0}^{n} \tilde{Y}_k^{(ij)} ;$    (4.17)

n = 1,2,...; i = 1,...,m; j = 1,...,m; $i \ne j$; let $p_n \le m^{-1}$;
n = 0,1,...; let assumption ($\mathcal{W}$; 3) hold. Let $J(x_1,x_2)$
denote either of the intervals $[x_1,x_2)$, $(x_1,x_2]$, where
$-\infty < x_1 < x_2 < +\infty$. Then

    $P\{Z_n^{(ij)} \in J(x_1,x_2)\} \le (x_2 - x_1)(2\pi)^{-1/2} n^{1/2} \left[\sum_{k=0}^{n} p_k^{-1}\right]^{-1/2}$
    $\quad + 32 C_1^3 \beta m^{1/2} \sum_{k=0}^{n} p_k^{-2} \left[\sum_{k=0}^{n} p_k^{-1}\right]^{-3/2} ,$    (4.18)

where $C_1$ is the constant from Lemma 2 and $\beta$ is the Berry-Esseen
constant.


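Proof.

Let $F_n^{(ij)}$ denote the distribution function of the law

    $\mathcal{L}\left( \left[\sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^2\right]^{-1/2} \sum_{k=0}^{n} \tilde{Y}_k^{(ij)} \right) ;$

let $\Phi$ be the distribution function of the normal law
$\mathcal{N}(0,1)$. Since the random variables $\tilde{Y}_k^{(ij)}$ are
independent, centered at expectations and, by Lemma 2, have positive
variances, the Berry-Esseen normal approximation theorem ([5], p. 288)
applies, yielding

    $\sup_{x \in R} |F_n^{(ij)}(x) - \Phi(x)| \le \beta \sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^3 \left( \sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^2 \right)^{-3/2} .$    (4.19)

Let $F_n^{*(ij)}$ be the distribution function of the law
$\mathcal{L}(Z_n^{(ij)})$; let

    $\sigma_n^{(ij)} = n^{1/2} \left[\sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^2\right]^{-1/2} .$

Since $\sigma_n^{(ij)} Z_n^{(ij)}$ has the distribution function
$F_n^{(ij)}$, we have

    $F_n^{*(ij)}(x) = F_n^{(ij)}(x \sigma_n^{(ij)}) .$

Further, using Lemma 2, we find

    $\sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^3 \left( \sum_{k=0}^{n} E|\tilde{Y}_k^{(ij)}|^2 \right)^{-3/2} \le 16 C_1^3 m^{1/2} \sum_{k=0}^{n} p_k^{-2} \left( \sum_{k=0}^{n} p_k^{-1} \right)^{-3/2} .$

Thus (4.19) becomes

    $\sup_{x \in R} |F_n^{*(ij)}(x) - \Phi(x \sigma_n^{(ij)})| \le 16 C_1^3 \beta m^{1/2} \sum_{k=0}^{n} p_k^{-2} \left( \sum_{k=0}^{n} p_k^{-1} \right)^{-3/2} .$    (4.20)

Finally, by the well known property of the distribution function $\Phi$
and Lemma 2,

    $\Phi(x_2 \sigma_n^{(ij)}) - \Phi(x_1 \sigma_n^{(ij)}) \le (x_2 - x_1)(2\pi)^{-1/2} n^{1/2} \left[\sum_{k=0}^{n} p_k^{-1}\right]^{-1/2} .$    (4.21)
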
Inequalities (4.20) and (4.21), together with the fact that the
Berry-Esseen Theorem holds both for right-continuous and left-continuous
distribution functions, prove (4.18).

The following lemma is a trivial generalization of the law of
large numbers.

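Lemma 4. Let $\{X_n: n = 1,2,\ldots\}$ be a sequence of random variables
with distributions depending on a sequence of parameters
$\underline{\vartheta} = \{\vartheta_n: n = 1,2,\ldots\} \in \Theta^\infty$;
let $\{\mathcal{F}_n: n = 0,1,\ldots\}$ be a sequence of
sub-$\sigma$-fields of $\mathcal{A}$. If

    $n^{-2} \sum_{k=1}^{n} E|X_k|^2 \to 0$  uniformly in $\underline{\vartheta} \in \Theta^\infty$    (4.22)

and if the family $\{X_1, \ldots, X_n\}$ is $\mathcal{F}_n$-measurable for
every n = 1,2,..., $\underline{\vartheta} \in \Theta^\infty$, then

    $n^{-1} \sum_{k=1}^{n} (X_k - E\{X_k|\mathcal{F}_{k-1}\}) \xrightarrow{P} 0$  uniformly in $\underline{\vartheta} \in \Theta^\infty$ .    (4.23)
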

Proof of Lemma 4.

The proof is straightforward. Let $\epsilon > 0$, let
$X'_n = X_n - E\{X_n|\mathcal{F}_{n-1}\}$. By Tchebichev Inequality

    $P\left\{ \left| n^{-1} \sum_{k=1}^{n} X'_k \right| \ge \epsilon \right\} \le n^{-2} \epsilon^{-2} E\left| \sum_{k=1}^{n} X'_k \right|^2 .$    (4.24)

However, since by assumption $\{X_1, \ldots, X_n\}$ is
$\mathcal{F}_n$-measurable, the random variables $X'_n$ are centered at
expectations given the predecessors so that the extended Bienaymé
Equality ([5], p. 386) holds. Therefore

    $E\left| \sum_{k=1}^{n} X'_k \right|^2 = \sum_{k=1}^{n} E|X'_k|^2 \le \sum_{k=1}^{n} E|X_k|^2 ,$

which, together with (4.24) and assumption (4.22) gives (4.23).

This completes the Proof of Lemma 4.

Later, we will use one simple lemma on sequences of positive real
numbers.

Lemma 5. Let $\{p_n: n = 0,1,\ldots\}$ be a sequence of positive numbers
such that for every n = 0,1,...

    $(n + 1) p_{n+1} \ge n p_n .$    (4.25)

Then

    $\liminf_{n \to +\infty} \frac{1}{n p_n^{-1}} \sum_{k=0}^{n-1} p_k^{-1} > 0 .$    (4.26)

Proof.

By (4.25) we have for every n = 1,2,...

    $p_k^{-1} \ge \frac{k}{n} p_n^{-1} , \quad k = 0,\ldots,n-1 .$    (4.27)

Hence

    $\frac{1}{n p_n^{-1}} \sum_{k=0}^{n-1} p_k^{-1} \ge \frac{1}{n^2} \sum_{k=0}^{n-1} k = \frac{n-1}{2n} \to \frac{1}{2}$    (4.28)

as $n \to +\infty$.

Lemma 5 is proven.
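
The bound of Lemma 5 can be observed numerically. The sketch below uses
the hypothetical schedule $p_n = (n+1)^{-1/3}$, which is an assumption
chosen for illustration; it satisfies (4.25) because $n (n+1)^{-1/3}$ is
nondecreasing in n.

```python
def p_example(n):
    """Hypothetical test probabilities p_n = (n + 1)^(-1/3)."""
    return (n + 1) ** (-1.0 / 3.0)

def lemma5_ratio(p, n):
    """The quantity (p_n / n) * sum_{k=0}^{n-1} 1 / p_k of (4.26)."""
    return p(n) / n * sum(1.0 / p(k) for k in range(n))

def lemma5_lower_bound(n):
    """The lower bound (n - 1) / (2 n) obtained in (4.28)."""
    return (n - 1) / (2.0 * n)
```

For this particular schedule the ratio stays well above the general lower
bound, which only uses condition (4.25).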

The remaining three lemmas constitute essential parts of the
theorems in the next chapter.

Lemma 6. Let $\{Y_n: n = 0,1,\ldots\}$ and
$\{\tilde{Y}_n: n = 0,1,\ldots\}$ be the sequences of random vectors (3.5)
and (4.2), respectively; let

    $S'_n = s^*\left(\sum_{k=0}^{n} Y_k\right) , \quad n = 0,1,\ldots ,$    (4.29)

and let

    ($\mathcal{W}$; r) hold for some $r \ge 3$ .    (4.30)

If the sequence $\{p_n: n = 0,1,\ldots\}$ of probabilities (3.2) satisfies
the conditions

    $p_n \downarrow 0 ,$    (4.31)

    $n p_n \uparrow +\infty ,$    (4.32)

then

    $n^{-1} \sum_{k=1}^{n} \tilde{Y}_k \cdot S'_k \xrightarrow{P} 0$  uniformly in all sequences $\underline{\vartheta} \in \Theta^\infty$ .    (4.33)


Proof . 

Let  ^  be  the  random  variables  (4.7).  Since  S'efl  for  all 
n  n 

n  =  0,1,...,  we  have  the  identity 


mm  n 


m  n 


1=1  j=l  k=0 


i=l  k=l 


By Lemma 2 (4.14) and the condition (4.31), we have for every $i = 1,\dots,m$; $n = 1,2,\dots$

$$n^{-2}\sum_{k=1}^{n} E\bigl|\tilde Y_k^{(i)}\bigr|^2 \le (2C_1)^2\, m\, n^{-2}\sum_{k=1}^{n} p_k^{-1} \le (2C_1)^2\, m\,(n p_n)^{-1} , \tag{4.35}$$

which goes to zero by (4.32). Since the random variables $\tilde Y_n^{(i)}$ are independent and centered at expectations, the weak law of large numbers (or Lemma 4 with $\mathcal{F}_n$ induced by $\{\tilde Y_0^{(i)},\dots,\tilde Y_n^{(i)}\}$) yields

$$\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k^{(i)} \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{4.36}$$
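In view of (4.34), it remains to prove that for any $i = 1,\dots,m$; $j = 1,\dots,m$; $i \ne j$,

$$\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k^{(ij)}\, S_k'^{(i)} \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{4.37}$$

Let $\mathcal{Y}_n$, $n = 0,1,\dots$, be the σ-field induced by the family $\{Y_0,\dots,Y_n\}$. Since, clearly,

$$\bigl\{\tilde Y_0^{(ij)} S_0'^{(i)},\dots,\tilde Y_n^{(ij)} S_n'^{(i)}\bigr\}$$

is $\mathcal{Y}_n$-measurable for every $n = 0,1,\dots$, and since by Lemma 2 (4.15), (4.31), and (4.32),

$$n^{-2}\sum_{k=1}^{n} E\bigl|\tilde Y_k^{(ij)} S_k'^{(i)}\bigr|^2 \le n^{-2}\sum_{k=1}^{n} E\bigl|\tilde Y_k^{(ij)}\bigr|^2 \to 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty , \tag{4.38}$$

we obtain (4.37) from Lemma 4 if we prove that for any $i \ne j$

$$\frac{1}{n}\sum_{k=1}^{n} E\bigl\{\tilde Y_k^{(ij)} S_k'^{(i)} \bigm| \mathcal{Y}_{k-1}\bigr\} \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{4.39}$$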


•  n  n 

*•  i  yki)  <  t  n  i 

vk=0  *=Ar/,,,mk=0 


(4.40) 


I  ^  £ 

k=0 


<  rain  S  Y ^  > 
~  o  .  „  L  k 

^=1 . ”  k=0 


(4.41) 


V  Yu> 

Z,  Yk 


X i i«* i m 


V  Y<J> 

Z-  k 


(4.42) 
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so  that  by  the  definition  of  S' 

n 


0.(0  _  T  .  JO  T 
Sn  -  1  (1)  *  “  1  (0  • 

n  n 


whence,  since  0  <  1  and  \ 

-  —  n  n  n  ’ 


'(i)  s  siU)  £  »  (i)  • 

n  n 


Therefore,  almost  surely 


We  will  consider  first 


'  ' 

~(ij) 

E  Y»  '  (i)  Vi  ’  • 

n 


(4.43) 


'teij)lH(i)|Vi}  •  Vi} 

n  I  n 

sE{?(u)s,(i)|Vij 

S-fe%W  •  E{?nlj)lK(i)|Vl} 

n  I  n 


(4.44) 


(4.45) 


Let $\{\gamma_n : n = 1,2,\dots\}$ be a sequence of positive real numbers (truncating constants); let


rW)  .  {|?<li!)|  <7„};  l  . 


1  i  »  t  •  p 


n  r(u) 

/•  i  “ 


(4.46) 


(4.47) 
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Since  clearly 


y(ij )l  V^1J^I  <  Y^1J^  I 

v"  >>  v"  W1’ s  ■ 

n  n  n 


Yll)Y ' 


(4.48) 


and  since  the  random  variable  |Y^  J  |l ^  is  independent  of  ^  ^ 


we  have  almost  surely 


(if) 


K{?nlj)lBa)|Vi}  -  E(?nlj)l„(i)n  r(i)|vi)|  S  *{lf 


Further 


(fr  -  £  (iff  • 

Hi 


(4.49) 


(4.50) 


and  by  the  definition  of  the  random  variable  Y 

n 

-*(rf  ))C.  I  /  1  •>  if ’<“)!  >  7„  ■ 

which,  in  turn,  implies  that  either  V^(w)  =  1  or  V^^(to) 


(4.51) 


=  1,  Hence 


either 


| Y^1  (w)J  >  7n  for  all  A  =  l,...,m;  A  ±  i  , 


[Y^iA^(u))|  =  0  for  all  A  =  1 . m,  A  /  l  . 


Therefore, 


-  "  (r f  f 

£=1  '  11  ' 

implies  that  either  | Y^1-*^  ^  (uo)  |  >  7n  or  ! ^ (to)  j 


=  0  so  that 
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/ 


(4.52) 


However , 


|y(^J  )  I  2 

'(r  lli)Y 


<  y\'T  K!?iij)!  <  (4Ci)r  ml"1('’npn 


il-r 


by assumption (4.30) and Lemma 2 (4.15), so that (4.49) becomes


F. 


Y^I 
n  H(i) 

n 


*  E 


»(ij)T  ,  v 

n  H^nr  1 

n  n 


n-1 


<  (4C1)r  mr  1(7nPn)1  r  a>8‘  (4.53) 


Next,  let  us  denote  for  £  =  l,...,m 
n-1 


^  =  < 
n 


l  ilt)  <  -  tl)  -  1 4 


(if) 


5<a)  = 

n 


k=0 

n-1 


k=0 


(4.54) 


n 


S  ^  <  r.  - 1  . 

k=0  k=0 


(4.55) 


,(ii)  _ 


n-1 


-14 


(i£) 


k=0 


k=0 


(4.56) 


where 


the  numbers  are  defined  by  (4.10).  Also  let 


n  £=1 

& 


It  is  easily  seen  that 


H 


U)  . 


.  h<‘>  .  n  . 

£  =  1 

(4.57) 

£^i 

n  h^>  , 

£=i  n 

(4.58) 

U  i 
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and  for  i  /  i 


whence 


H(i^)c  H(i^)n  r(ii)  c  jj(U) 
— n  n  n  n 


H(i)  c  H(i)  n  h^1  ^  . 

— n  n  n  n 


(4.59) 


(4.60) 


y(AJ  )t  =  y^iJ^+I  -  V^iJ^_T 

n  h^o  n  n 

n  n  n  n  n  1  n 


<  V(ij)+I  ,  >  -  v^J^T  <  Iv^^lT  .  ?(ij)T  (A 

~  “  ff(l)  n  H(l)  ‘  1  “  1  I(i)-  H(l)  n  H(l)  ’ 

n  — n  n  — n  — n 


nnd,  similarly, 


?»lJ,,Hd)n  r(D  *  -  l?i1J,l,g(‘).  «(0  +  ?»1J)l„(D  •  (4'62) 

n  1  n 


Since,  by  definition,  both  and  are  sj  ,  -measurable  sets  we 

n  -n  Jn-1 


4?llj)lH(l)|Vl)=IH(i)K{fi1J,}=° 


a.s.  ,  (4.63) 


and  by  Lemma  2  (4.15), 


e{|?«ij)|,5w.  .<i>M 

=  V0-  „(0  E{l^nlj)|}  2  4C1  Vi).  H(l) 


a.s.  (4.64) 


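We are now going to show that for a suitable choice of the truncating sequence $\{\gamma_n\}$

$$I_{\bar H_n^{(i)} - \underline{H}_n^{(i)}} \xrightarrow{P} 0 \quad\text{as } n \to \infty \text{ uniformly in } \underline{\vartheta}\in\Theta^\infty , \tag{4.65}$$

which is equivalent to showing that

$$P\bigl(\bar H_n^{(i)} - \underline{H}_n^{(i)}\bigr) \to 0 \quad\text{as } n \to \infty \text{ uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{4.66}$$

First, since

$$\bar H_n^{(i)} - \underline{H}_n^{(i)} \subset \bigcup_{\substack{\ell=1\\ \ell\ne i}}^{m} \bigl(\bar H_n^{(i\ell)} - \underline{H}_n^{(i\ell)}\bigr) , \tag{4.67}$$

we have

$$P\bigl(\bar H_n^{(i)} - \underline{H}_n^{(i)}\bigr) \le \sum_{\substack{\ell=1\\ \ell\ne i}}^{m} P\bigl(\bar H_n^{(i\ell)} - \underline{H}_n^{(i\ell)}\bigr) . \tag{4.68}$$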


-(ii)  (il) 

Next,  the  set-theoretical  difference  II'  '  -  H'  '  can  be  written  as 

n  — n 

-HD  .  Hi)  m  /.(„.iri/2  .  {n.l}-i/2  y  AU)  <  z(i/) 

n  -n  '  n  Lu  k  -  n-1 

k=0 

n 

<  (n-l)‘l/2  7n  -  (n-l)’l/2  ]>  >  ,  n  =  2,3 . (4.69) 

tr=n 


and  since  there  is  no  loss  of  generality  by  assuming  pn  <  m  , 
n  =  0,1,,..,  Lemma  3,  together  with  (4.68),  yields 

yi/2 


n  /  n  \-3/2 

+  32cJ&m3/2  £  p'k2t  ^  p'M  ;  n  =  2,3,...  .  (4.70) 

k=0  \k=0  / 

From the conditions (4.31) and (4.32), it follows that the sequence $\{p_n : n = 0,1,\dots\}$ satisfies the hypothesis of Lemma 5. The statement then implies that there is a positive constant $c_0 \le 1$ such that for every $n = 1,2,\dots$
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n-1 

C  np  \  <  V  n 

o  n- 1  -  ^  *k 
k=0 


Moreover,  by  (4.31 )  aiso 


(4.71) 


S  p'k'  -  "pn?';  *  -  1>2  • 


(4.72) 


Applying these two inequalities to (4.70), we obtain

$$P\bigl(\bar H_n^{(i)} - \underline{H}_n^{(i)}\bigr) \le \gamma_n (2m)^{1/2} c_0^{-1/2} n^{-1/2} p_n^{1/2} + 32\,C_1^3\, c_0^{-3/2} m^{3/2} (n p_n)^{-1/2} ; \qquad n = 2,3,\dots \tag{4.73}$$

and trivially also for $n = 1$ (for $c_0$ small enough if necessary).

We will now select the constants $\gamma_n$. To minimize both (4.53) and the first term on the right-hand side of (4.73) simultaneously, we require $\gamma_n$ to satisfy the equation

$$(4C_1)^r\, m^{r-1} (\gamma_n p_n)^{1-r} = 4C_1 \gamma_n (2m)^{1/2} c_0^{-1/2} n^{-1/2} p_n^{1/2} . \tag{4.74}$$

For  such  a  y  ,  the  inequality  (4.73)  becomes 

M  <  (4CJ2  2*/2  c'l/2  »3/2  (np„)l/2r-  1/2 


+  32C^  c3/2M3/2  (np  )  1//Z  >  n  =  l-2,...  > 
lo  n 


(4.75) 


and the right-hand side of the inequality (4.53) becomes

$$(4C_1)^2\, 2^{1/2} c_0^{-1/2} m^{3/2} (n p_n)^{1/2r - 1/2} , \qquad n = 1,2,\dots \tag{4.76}$$
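The balancing step can be made concrete. The sketch below assumes (4.74) reads $(4C_1)^r m^{r-1}(\gamma_n p_n)^{1-r} = 4C_1\gamma_n(2m)^{1/2}c_0^{-1/2}n^{-1/2}p_n^{1/2}$, solves it for $\gamma_n$ in closed form, and compares the common value of both sides with the expression $(4C_1)^2 2^{1/2} c_0^{-1/2} m^{3/2}(np_n)^{1/2r-1/2}$ of (4.76). Under the additional assumptions $4C_1 \ge 1$, $c_0 \le 1$, $m \ge 1$ (made here only for the check), the common value does not exceed that expression. The numerical inputs and function names are hypothetical.

```python
import math


def gamma_n(n, p_n, r, m, C1, c0):
    """Solve the balancing equation (as read from (4.74)) for the truncating constant."""
    g_pow_r = ((4 * C1) ** (r - 1) * m ** (r - 1) * p_n ** (1 - r)
               * math.sqrt(c0 * n / (2 * m * p_n)))
    return g_pow_r ** (1.0 / r)


def side_4_53(g, n, p_n, r, m, C1, c0):
    """Left-hand side of (4.74): the bound appearing in (4.53)."""
    return (4 * C1) ** r * m ** (r - 1) * (g * p_n) ** (1 - r)


def side_4_73(g, n, p_n, r, m, C1, c0):
    """Right-hand side of (4.74): 4*C1 times the first term of (4.73)."""
    return 4 * C1 * g * math.sqrt(2 * m) * c0 ** -0.5 * n ** -0.5 * math.sqrt(p_n)


def bound_4_76(n, p_n, r, m, C1, c0):
    """The expression (4.76), decaying like (n p_n)**(1/(2r) - 1/2)."""
    return ((4 * C1) ** 2 * math.sqrt(2) * c0 ** -0.5 * m ** 1.5
            * (n * p_n) ** (1 / (2 * r) - 0.5))


if __name__ == "__main__":
    args = dict(n=1000, p_n=0.1, r=3, m=4, C1=1.0, c0=0.5)  # illustrative values
    g = gamma_n(**args)
    print(g, side_4_53(g, **args), side_4_73(g, **args), bound_4_76(**args))
```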


Now it is obvious that both these bounds tend to zero as $n \to +\infty$ so that (4.66) holds, and by (4.53) we conclude that (4.45) goes to zero in probability uniformly in $\underline{\vartheta}\in\Theta^\infty$. Finally, it is easily seen that exactly
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the  sets 


the  same  reasoning  applies  if  we  replace  the  set  and  all 

(i)  n 

related  to  it  by  the  set  and  similarly  related  analogs. 


Therefore  also 

E 


w(iJ )T 

V" 

n 


’n-l 


P  00 

-*  0  uniformly  in  £60  , 


(4.77) 


which  in  view  of  (4.44)  yields  (4.39). 
The  proof  of  Lemma  6  is  terminated. 


Lemma 7. Let the hypothesis of Lemma 6 be satisfied. Then there is a finite constant $C_2$ such that for every $n = 1,2,\dots$ and $\underline{\vartheta}\in\Theta^\infty$

$$\bigl|E(\tilde Y_n \cdot S'_n)\bigr| \le C_2\, m^{5/2} (n p_n)^{1/2r - 1/2} . \tag{4.78}$$


Proof . 

'N* 

Since  the  random  vectors  are  centered  at  expectations, 
identity 


Y 


n 


m 


i=l 


I 


j=l 


s(ij)Q,U) 


ra 


i=l 


the 


(4.79) 


yields 


lE<Vs;n  |Ei?'1JV1))|  • 

i  “li  f  •  •  a  f  O 

J “1  $  •  •  • 


(4.80) 


Proceeding similarly as in the proof of Lemma 6, we conclude that both the inequalities (4.44) and (4.53) hold with conditional expectations replaced by unconditional ones. The same is true for the relations (4.63) and (4.64) so that (4.75) and (4.76), together with the remark at the end of the proof of the previous lemma, give for every $i \ne j$, $n = 1,2,\dots$, $\underline{\vartheta}\in\Theta^\infty$,

$$\bigl|E\bigl(\tilde Y_n^{(ij)} S_n'^{(i)}\bigr)\bigr| \le C_2\, m^{3/2} (n p_n)^{1/2r - 1/2} , \tag{4.81}$$
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where  the  constant  C„  includes  the  constants  C,  ,  c  ,  Q. 

2  1  o 

Lemma  7  is  proven. 

Lemma 8. Let the hypothesis of Lemma 6 be satisfied with the assumption $(\mathcal{W};r)$ replaced by the assumption $(\mathcal{W};\infty)$. Then there is a finite constant $C_3$ such that for every $n = 1,2,\dots$ and $\underline{\vartheta}\in\Theta^\infty$

$$\bigl|E(\tilde Y_n \cdot S'_n)\bigr| \le C_3\, m^{5/2} (n p_n)^{-1/2} . \tag{4.82}$$

Proof.

The proof is only a modification of the proofs of Lemma 7 and Lemma 6. Clearly, $(\mathcal{W};\infty)$ implies $(\mathcal{W};3)$ so that the hypotheses of the two previous lemmas are satisfied. Moreover, under $(\mathcal{W};\infty)$ there is a finite constant $C_4$ such that for all $i = 1,\dots,m$; $n = 0,1,\dots$;

$$\bigl|Y_n^{(i)}\bigr| \le C_4\, m\, p_n^{-1} \quad\text{a.s.} \tag{4.83}$$

and consequently also for all $j = 1,\dots,m$

$$\bigl|\tilde Y_n^{(ij)}\bigr| \le 2C_4\, m\, p_n^{-1} \quad\text{a.s.} \tag{4.84}$$


Thus we can set

$$\gamma_n = 2C_4\, m\, p_n^{-1} , \qquad n = 1,2,\dots$$

in (4.46) whence in (4.48)


Vi0)' 


=  0 


a.s. 


(4.85) 


so  that  the  right-hand  side  of  (4.49)  is  zero  and  the  first  term  on  the 
right-hand  side  of  (4.73)  becomes 


-3/2  „  -l/2  3/2  ,  v-l/2 

2  '  C.  c  '  m  /  (np  ) 

4  o  n 


(4.86) 




SEL-67-098 


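This implies

$$\bigl|E\bigl\{\tilde Y_n^{(ij)} S_n'^{(i)}\bigr\}\bigr| \le C_3\, m^{3/2} (n p_n)^{-1/2} \tag{4.87}$$

for every $i \ne j$; $n = 1,2,\dots$; $\underline{\vartheta}\in\Theta^\infty$, which together with (4.80) proves Lemma 8.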
Chapter V
MAIN THEOREMS

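Theorem 1. Let $\{\Psi_n : n = 1,2,\dots\}$ be the sequence of pure strategies generated by the strategic rule (*) defined in Chapter III; let the sequence of probabilities (3.2) satisfy the conditions

$$p_n \downarrow 0 \tag{5.1}$$

and

$$n\,p_n \uparrow +\infty \quad\text{as } n \to +\infty ; \tag{5.2}$$

let the assumptions (2.1) and $(\mathcal{W};r)$ for some $r \ge 3$ be satisfied. Then

$$\frac{1}{n}\sum_{k=1}^{n} W_k(\vartheta_k,\Psi_k) - \varphi(\tau_n) \xrightarrow{P} 0 \tag{5.3}$$

uniformly in all sequences $\underline{\vartheta}\in\Theta^\infty$.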


Proof . 

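Let $\underline{\vartheta}$ be an arbitrary sequence from $\Theta^\infty$; let, for $n = 1,2,\dots$, $\mathcal{F}_n$ be the σ-field induced by the family of random vectors

$$\bigl\{W_0(\vartheta_0,\Psi_0),\dots,W_n(\vartheta_n,\Psi_n);\ S_1,\dots,S_{n+1}\bigr\} , \tag{5.4}$$

where $S_n$ is defined by (*).

By (2.16) and (2.17) we have for every $n = 1,2,\dots$

$$E\bigl\{W_n(\vartheta_n,\Psi_n) \bigm| \mathcal{F}_{n-1}\bigr\} = w(\vartheta_n,S_n) \quad\text{a.s.} \tag{5.5}$$

Furthermore, $(\mathcal{W};r)$, $r \ge 3$, implies that for every $n = 1,2,\dots$, $\underline{\vartheta}\in\Theta^\infty$ and $1 \le r' \le r$

$$E\bigl|W_n(\vartheta_n,\Psi_n)\bigr|^{r'} \le C_0^{\,r'} . \tag{5.6}$$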
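Therefore, by Lemma 4, (5.3) is equivalent to

$$\frac{1}{n}\sum_{k=1}^{n} w(\vartheta_k,S_k) - \varphi(\tau_n) \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{5.7}$$

Let $\{S'_n : n = 0,1,\dots\}$ be a sequence of random vectors defined by

$$S'_n = s\Bigl(n^{-1}\sum_{k=0}^{n} Y_k\Bigr) , \tag{5.8}$$

where $\{Y_n : n = 0,1,\dots\}$ is the sequence (3.5), and let us denote

$$X_n = \frac{1}{n}\sum_{k=1}^{n} w(\vartheta_k,S_k) - \varphi(\tau_n) ,$$

$$X'_n = \frac{1}{n}\sum_{k=1}^{n} w(\vartheta_k,S'_k) - \varphi(\tau_n) , \tag{5.9}$$

$$X''_n = \frac{1}{n}\sum_{k=1}^{n} w(\vartheta_k,S'_{k-1}) - \varphi(\tau_n) .$$

We have

$$|X_n - X'_n| \le \frac{1}{n}\sum_{k=1}^{n}\bigl|w(\vartheta_k,S_k) - w(\vartheta_k,S'_k)\bigr| \tag{5.10}$$

and

$$|X_n - X''_n| \le \frac{1}{n}\sum_{k=1}^{n}\bigl|w(\vartheta_k,S_k) - w(\vartheta_k,S'_{k-1})\bigr| . \tag{5.11}$$

Let $\{U_n : n = 0,1,\dots\}$ be the sequence of random variables (3.1). Since by (*) and (5.8), $U_n = 0 \Rightarrow S_n = S'_{n-1}$, and by (3.6) and (5.8), $U_n = 0 \Rightarrow Y_n = 0 \Rightarrow S'_n = S'_{n-1}$, we conclude that

$$\bigl|w(\vartheta_n,S_n) - w(\vartheta_n,S'_{n-1})\bigr| > 0 \ \Rightarrow\ U_n = 1 , \tag{5.12}$$

and also

$$\bigl|w(\vartheta_n,S_n) - w(\vartheta_n,S'_n)\bigr| > 0 \ \Rightarrow\ U_n = 1 . \tag{5.13}$$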

From this and (5.6), it follows that the right-hand sides of (5.10) and (5.11) are both bounded by

$$2C_0\,\frac{1}{n}\sum_{k=1}^{n} U_k ,$$

which goes to zero almost surely because of the condition (5.1). Thus we have

$$|X_n - X'_n| \xrightarrow{a.s.} 0 \quad\text{and}\quad |X_n - X''_n| \xrightarrow{a.s.} 0 , \tag{5.14}$$

both uniformly in $\underline{\vartheta}\in\Theta^\infty$.

We are going to prove that the random variables $X'_n$ are bounded from above and the random variables $X''_n$ from below by random variables that both tend to zero in probability uniformly in $\underline{\vartheta}\in\Theta^\infty$. This, in view of (5.14), will prove (5.7).

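Let us start with $X'_n$. Using (2.14), we obtain

$$X'_n = \frac{1}{n}\sum_{k=1}^{n}\bigl[k\,w(\tau_k,S'_k) - (k-1)\,w(\tau_{k-1},S'_k)\bigr] - \varphi(\tau_n)$$

$$= \frac{1}{n}\Bigl[\sum_{k=1}^{n} k\,w(\tau_k,S'_k) - \sum_{k=1}^{n-1} k\,w(\tau_k,S'_{k+1}) - n\,\varphi(\tau_n)\Bigr]$$

$$= \frac{1}{n}\sum_{k=1}^{n-1} k\bigl[w(\tau_k,S'_k) - w(\tau_k,S'_{k+1})\bigr] + w(\tau_n,S'_n) - \varphi(\tau_n) , \qquad n = 1,2,\dots \tag{5.15}$$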


Let

$$Z_n = n^{-1/2}\sum_{k=0}^{n} \tilde Y_k ; \quad n = 1,2,\dots ; \qquad Z_0 = 0 , \tag{5.16}$$

where $\tilde Y_n = Y_n - E(Y_n)$, $n = 0,1,\dots$. By the definition of the mapping $s$ and by (4.1), we have

$$S'_n = s\bigl(w(\tau_n,\cdot) + n^{-1/2} Z_n + \eta\bigr) = s\bigl(w(\tau_n,\cdot) + n^{-1/2} Z_n\bigr) , \qquad n = 1,2,\dots , \tag{5.17}$$


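where $\eta \in R^m$ is the vector with all components equal to one. Applying now Lemma 1 with $x = k^{-1/2} Z_k$, $a_1 = S'_k$, $a_2 = S'_{k+1}$ to the summands in (5.15) and with $x = n^{-1/2} Z_n$, $a_1 = s^*(w(\tau_n,\cdot))$, $a_2 = S'_n$, we obtain in view of (5.17) and (2.6) the relation

$$X'_n \le \frac{1}{n}\sum_{k=1}^{n-1} k^{1/2} Z_k \cdot (S'_{k+1} - S'_k) + n^{-1/2} Z_n \cdot \bigl(s^*(w(\tau_n,\cdot)) - S'_n\bigr)$$

$$= \frac{1}{n}\sum_{k=1}^{n}\bigl((k-1)^{1/2} Z_{k-1} - k^{1/2} Z_k\bigr)\cdot S'_k + n^{-1/2} Z_n \cdot s^*(w(\tau_n,\cdot)) . \tag{5.18}$$

However, by (5.16),

$$(k-1)^{1/2} Z_{k-1} - k^{1/2} Z_k = -\tilde Y_k ; \qquad k = 1,2,\dots ; \tag{5.19}$$

so that (5.18) becomes

$$X'_n \le -\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k \cdot S'_k + n^{-1/2} Z_n \cdot s^*(w(\tau_n,\cdot)) . \tag{5.20}$$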
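The passage from the first to the second line of (5.18), combined with (5.19), is a summation by parts and can be verified numerically. The sketch below assumes the relation is read as $\frac{1}{n}\sum_{k=1}^{n-1} k^{1/2}Z_k\cdot(S'_{k+1}-S'_k) + n^{-1/2}Z_n\cdot(s^* - S'_n) = -\frac{1}{n}\sum_{k=1}^{n}\tilde Y_k\cdot S'_k + n^{-1/2}Z_n\cdot s^*$. The vectors are random stand-ins (not the report's $S'_k$), and $\tilde Y_0$ is taken to be zero so that (5.19) also holds for $k = 1$; both choices are assumptions of this illustration.

```python
import random


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def random_simplex_point(rng, m):
    """Hypothetical stand-in for a mixed strategy: positive weights summing to one."""
    w = [rng.random() + 1e-9 for _ in range(m)]
    total = sum(w)
    return [x / total for x in w]


def both_sides(n, m, seed=0):
    """Return the two sides of the summation-by-parts step for random inputs."""
    rng = random.Random(seed)
    # y[k] plays the role of Y~_k; y[0] = 0 so that (5.19) holds for k = 1.
    y = [[0.0] * m] + [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    # r[k] = k^(1/2) Z_k = y[0] + ... + y[k]
    r = [y[0][:]]
    for k in range(1, n + 1):
        r.append([a + b for a, b in zip(r[k - 1], y[k])])
    s = [random_simplex_point(rng, m) for _ in range(n + 2)]  # s[k] stands for S'_k
    s_star = random_simplex_point(rng, m)

    left = sum(dot(r[k], [a - b for a, b in zip(s[k + 1], s[k])]) for k in range(1, n))
    left += dot(r[n], [a - b for a, b in zip(s_star, s[n])])
    right = -sum(dot(y[k], s[k]) for k in range(1, n + 1)) + dot(r[n], s_star)
    return left / n, right / n


if __name__ == "__main__":
    print(both_sides(50, 3))
```

The two returned numbers agree up to floating-point rounding, whatever the random inputs.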


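The first term on the right-hand side of (5.20) goes to zero in probability uniformly in $\underline{\vartheta}\in\Theta^\infty$ by Lemma 6. Next, since $s^*(w(\tau_n,\cdot))$ lies in the simplex of mixed strategies,

$$\bigl|n^{-1/2} Z_n \cdot s^*(w(\tau_n,\cdot))\bigr| \le \max_{i=1,\dots,m}\Bigl|\frac{1}{n}\sum_{k=0}^{n} \tilde Y_k^{(i)}\Bigr| . \tag{5.21}$$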
However, the random variables $\tilde Y_k^{(i)}$; $k = 0,1,\dots$; are independent and centered at expectation, and by Lemma 2 (4.14) and (5.1)

$$n^{-2}\sum_{k=0}^{n} E\bigl|\tilde Y_k^{(i)}\bigr|^2 \le (2C_1)^2\, m\, n^{-2}\sum_{k=0}^{n} p_k^{-1} \le (2C_1)^2\, m\, n^{-2}(n+1)\,p_n^{-1} , \tag{5.22}$$

which goes to zero by the condition (5.2). Hence by the weak law of large numbers¹ ([5], p. 234, prop. b)

$$\frac{1}{n}\sum_{k=0}^{n} \tilde Y_k^{(i)} \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty \tag{5.23}$$

for all $i = 1,\dots,m$. Because of (5.21), the same is true for the last term of (5.20) so that

$$\limsup_{n\to+\infty} X'_n = 0 \quad\text{in probability uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{5.24}$$

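It remains to show that also

$$\liminf_{n\to+\infty} X''_n = 0 \quad\text{in probability uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{5.25}$$

For this, let us write using again (2.6)

$$X''_n = \frac{1}{n}\sum_{k=1}^{n} k\bigl[w(\tau_k,S'_{k-1}) - w(\tau_k,S'_k)\bigr] + w(\tau_n,S'_n) - \varphi(\tau_n) . \tag{5.26}$$

Using (5.17) and applying again Lemma 1 to the summands, this time with $a_2 = S'_{k-1}$, we obtain

$$w(\tau_k,S'_{k-1}) - w(\tau_k,S'_k) \ge k^{-1/2} Z_k \cdot (S'_k - S'_{k-1}) . \tag{5.27}$$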


¹ Or Lemma 4 with $\mathcal{F}_n$ induced by $\{\tilde Y_0^{(i)},\dots,\tilde Y_n^{(i)}\}$.
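Further, by (2.6),

$$w(\tau_n,S'_n) - \varphi(\tau_n) \ge 0 , \tag{5.28}$$

so that

$$X''_n \ge \frac{1}{n}\sum_{k=1}^{n} k^{1/2} Z_k \cdot (S'_k - S'_{k-1}) = \frac{1}{n}\sum_{k=1}^{n}\bigl((k-1)^{1/2} Z_{k-1} - k^{1/2} Z_k\bigr)\cdot S'_{k-1} + n^{-1/2} Z_n \cdot S'_n . \tag{5.29}$$

Hence by (5.19) and the fact that $S'_n$ lies in the simplex of mixed strategies, we obtain

$$X''_n \ge -\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k \cdot S'_{k-1} - \max_{i=1,\dots,m}\Bigl|\frac{1}{n}\sum_{k=0}^{n} \tilde Y_k^{(i)}\Bigr| , \qquad n = 1,2,\dots \tag{5.30}$$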


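For every $n = 0,1,\dots$, let $\mathcal{Y}_n$ denote the σ-field induced by the family $\{Y_0,\dots,Y_n\}$. Since the random vectors $\tilde Y_n$ and $S'_{n-1}$ are independent and $S'_n$ is $\mathcal{Y}_n$-measurable, we have for all $n = 1,2,\dots$

$$E\bigl\{\tilde Y_n \cdot S'_{n-1} \bigm| \mathcal{Y}_{n-1}\bigr\} = E(\tilde Y_n)\cdot S'_{n-1} = 0 \quad\text{a.s.} \tag{5.31}$$

Hence, by Lemma 4 and (5.22),

$$\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k \cdot S'_{k-1} \xrightarrow{P} 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty ,$$

and by (5.23) the same is true for the last term in (5.30). Thus (5.25) holds and Theorem 1 is proven.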


Because of assumption $(\mathcal{W};r)$, the dominated convergence theorem implies that under the conditions of Theorem 1 also

$$E\Bigl\{\frac{1}{n}\sum_{k=1}^{n} W_k(\vartheta_k,\Psi_k)\Bigr\} - \varphi(\tau_n) \to 0 \quad\text{uniformly in } \underline{\vartheta}\in\Theta^\infty . \tag{5.32}$$

It is, nevertheless, of interest to investigate the rate of this convergence.


Theorem 2. Let the hypothesis of Theorem 1 be satisfied.
Then 


K 


n 


n  I  VVVr 


k=l 


=  01  max 


uniformly  in  all  sequences  je9  . 


Proof . 

Proceeding exactly as in the proof of Theorem 1, we observe that the random variables (5.9) satisfy for every $n = 1,2,\dots$ the inequality

$$X''_n - 2C_0\,\frac{1}{n}\sum_{k=1}^{n} U_k \le X_n \le X'_n + 2C_0\,\frac{1}{n}\sum_{k=1}^{n} U_k . \tag{5.34}$$

Since the random variables $U_n$ take values 0 and 1 with $P\{U_n = 1\} = p_n$, we have

$$E\{X''_n\} - 2C_0\,\frac{1}{n}\sum_{k=1}^{n} p_k \le E\{X_n\} \le E\{X'_n\} + 2C_0\,\frac{1}{n}\sum_{k=1}^{n} p_k . \tag{5.35}$$

Furthermore, the inequality (5.20), Lemma 7, and the fact that $E\{Z_n\} = 0$ imply

$$E\{X'_n\} \le C_2\, m^{5/2}\,\frac{1}{n}\sum_{k=1}^{n} (k p_k)^{-1/2 + 1/2r} , \tag{5.36}$$

and the inequality (5.30), independence of $\tilde Y_k$ and $S'_{k-1}$, and $E(\tilde Y_k) = 0$, $k = 1,2,\dots$, give

$$E\{X''_n\} \ge -\max_{i=1,\dots,m} E\Bigl|\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k^{(i)}\Bigr| . \tag{5.37}$$

Next, by the Jensen Inequality, the Bienaymé Equality, and (5.22) we have for every $i = 1,\dots,m$

$$\Bigl(E\Bigl|\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k^{(i)}\Bigr|\Bigr)^2 \le E\Bigl|\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k^{(i)}\Bigr|^2 = \frac{1}{n^2}\sum_{k=1}^{n} E\bigl|\tilde Y_k^{(i)}\bigr|^2 \le (2C_1)^2\, m\,(n p_n)^{-1} , \tag{5.38}$$

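so that (5.35) becomes

$$-2C_1\, m^{1/2} (n p_n)^{-1/2} - 2C_0\,\frac{1}{n}\sum_{k=1}^{n} p_k \le E\{X_n\} \le C_2\, m^{5/2}\,\frac{1}{n}\sum_{k=1}^{n} (k p_k)^{-1/2 + 1/2r} + 2C_0\,\frac{1}{n}\sum_{k=1}^{n} p_k . \tag{5.39}$$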

However,  by  (5.l),  (5.2),  and  r  >  3, 


k=l 


so  that  the  first  term  in  (5.39)  is  lower-bounded  by 

-ii  <%>-i/2ti/2r- 


k=l 


Theorem  2  is  proven. 


(5.38) 


(5.40) 
Theorem 3. Let the hypothesis of Theorem 1 be satisfied with the assumption $(\mathcal{W};r)$ replaced by the assumption $(\mathcal{W};\infty)$.

Then 


uniformly in all sequences $\underline{\vartheta}\in\Theta^\infty$.

Proof . 

The proof is verbatim that of Theorem 2, except that Lemma 8 is used instead of Lemma 7 so that the exponent $1/2r$ vanishes from (5.36), (5.39), and (5.40), and the constant $C_2$ is replaced by the constant $C_3$ of Lemma 8.


To the end of this chapter, let us consider the case in which the sequences $\underline{\vartheta}$ of Nature's strategies are sequences of independent identically distributed random variables. More precisely, let us suppose that Nature at the beginning of the sequence of plays selects a probability measure $\tau$ from a class $\mathcal{T}_1$ of probability measures defined on the measurable space $(\Theta,\mathcal{F})$. The class $\mathcal{T}_1$ and the σ-field $\mathcal{F}$, not necessarily identical with those introduced in Chapter II, are supposed to be given but unknown to the player. We also assume that the random loss function $\mathcal{W}$ is such that for every $a \in A$, $\tau \in \mathcal{T}_1$, $W(\cdot,a)$ is for almost every $\omega$ an $\mathcal{F}$-measurable and $\tau$-integrable function.

In this setup, let $\tau \in \mathcal{T}_1$ and let

$$\{\Phi_n : n = 1,2,\dots\} \tag{5.42}$$

be a sequence of independent identically distributed random mappings from the underlying probability space into $(\Theta,\mathcal{F})$ such that $P\Phi_1^{-1} = \tau$ and such that the sequences (5.42), (3.1), (3.3), and (2.15) for any fixed $\underline{\vartheta}\in\Theta^\infty$ are mutually independent.
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Theorem  1  then  yields  immediately 


n 

n  Wk^V*k^  "  ^  °  uniformly  ln  T€T1  •  (5.43) 

k=l 


However,  we  may  establish  a  stronger  convergence  in  this  case. 

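Theorem 4. Let $\{\Psi_n : n = 1,2,\dots\}$ be a sequence of pure strategies generated by the strategic rule (*) of Chapter III; let the sequence of probabilities (3.2) satisfy the conditions

$$p_n \downarrow 0 \tag{5.44}$$

and

$$n^{1-\varepsilon}\,p_n \uparrow +\infty \quad\text{for some } \varepsilon > 0 ; \tag{5.45}$$

let the assumptions (2.2) and $(\mathcal{W};2)$ be satisfied. Then

$$\frac{1}{n}\sum_{k=1}^{n} W_k(\Phi_k,\Psi_k) - \varphi(\tau) \xrightarrow{a.s.} 0 \tag{5.46}$$

uniformly in all $\tau \in \mathcal{T}_1$.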

Proof . 

Let $\tau \in \mathcal{T}_1$ and let again $\mathcal{F}_n$ denote the σ-field induced by the family (5.4), where now $\vartheta_1,\vartheta_2,\dots$ is the sequence (5.42). By the measurability and integrability assumption we made about the random loss function, we have now instead of (5.5)

$$E\bigl\{W_n(\Phi_n,\Psi_n) \bigm| \mathcal{F}_{n-1}\bigr\} = w(\tau,S_n) \quad\text{a.s.} \tag{5.47}$$

Hence, using the assumption $(\mathcal{W};2)$ and the Stability Theorem ([5], p. 387), we conclude that (5.46) is equivalent to showing

$$\frac{1}{n}\sum_{k=1}^{n} w(\tau,S_k) - \varphi(\tau) \xrightarrow{a.s.} 0 \quad\text{uniformly in } \tau \in \mathcal{T}_1 . \tag{5.48}$$

Let $X_n$, $X'_n$ and $X''_n$ be defined as in (5.9) with $\vartheta_n$ and $\tau_n$ replaced by $\tau$. By the same reasoning as in the proof of Theorem 1, we conclude that (5.14) holds again, now uniformly in $\tau \in \mathcal{T}_1$.

By  the  relation  (5,17)  and  Lemma  1  we  have 


w(t.s;)  -  *(T)  <i  £  v[**HT->)  • s;. 

k=0 


<  max 
i=l , . . . ,m 


i  2 

k=0 


;(i) 


(5.49) 


and by the assumption $(\mathcal{W};2)$ and Lemma 2 (4.14) for every $i = 1,\dots,m$

$$\sum_{k=1}^{n} k^{-2} E\bigl|\tilde Y_k^{(i)}\bigr|^2 \le (2C_1)^2\, m \sum_{k=1}^{n} k^{-2} p_k^{-1} . \tag{5.50}$$

However, by (5.45), the series

$$\sum_{k=1}^{\infty} k^{-2} p_k^{-1}$$

converges so that by the strong law of large numbers ([5], p. 238, prop. A) the right-hand side of (5.49) converges to zero almost surely uniformly in $\tau \in \mathcal{T}_1$.

As for $X''_n$, by (5.31), (5.50), and the Stability Theorem

$$\frac{1}{n}\sum_{k=1}^{n} \tilde Y_k \cdot S'_{k-1} \xrightarrow{a.s.} 0 \quad\text{uniformly in } \tau \in \mathcal{T}_1 , \tag{5.51}$$

so that the same is true for both terms on the right-hand side of (5.30).

Therefore, (5.24) and (5.25) hold almost surely uniformly in $\tau \in \mathcal{T}_1$, which together with (5.14) terminates the proof.
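The convergence of the series invoked in this proof can be spelled out. The short derivation below is a sketch that reads the series as $\sum_k k^{-2}p_k^{-1}$ (the sum on the right-hand side of (5.50)) and reads condition (5.45) as saying that $n^{1-\varepsilon}p_n$ is nondecreasing in $n$; both readings are assumptions of this note.

```latex
n^{1-\varepsilon} p_n \;\ge\; 1^{1-\varepsilon} p_1 = p_1 \quad (n \ge 1)
\;\Longrightarrow\;
p_k^{-1} \;\le\; p_1^{-1}\, k^{1-\varepsilon},
\qquad\text{hence}\qquad
\sum_{k=1}^{\infty} k^{-2} p_k^{-1}
\;\le\; p_1^{-1} \sum_{k=1}^{\infty} k^{-1-\varepsilon} \;<\; \infty .
```

With only the weaker condition (5.2), that is $\varepsilon = 0$, the same argument yields the harmonic series, which is why the strengthened condition (5.45) appears in Theorem 4.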

Theorems  2  and  3  appear  also  in  a  stronger  version. 
Theorem 5. Let $\{\Psi_n : n = 1,2,\dots\}$ be a sequence of pure strategies generated by the strategic rule (*) of Chapter III; let the sequence of probabilities (3.2) satisfy the conditions (5.1) and (5.2) of Theorem 1; let the assumptions (2.2) and $(\mathcal{W};2)$ hold. Then

$$\bigl|E\{W_n(\Phi_n,\Psi_n)\} - \varphi(\tau)\bigr| = O\bigl(\max\{p_n,\ (n p_n)^{-1/2}\}\bigr) \tag{5.52}$$

uniformly in all $\tau \in \mathcal{T}_1$.


Proof . 

Proceeding as in the proof of Theorem 4, we obtain the inequality

$$E\{w(\tau,S'_{n-1})\} - \varphi(\tau) - 2C_0\,p_n \le E\{W_n(\Phi_n,\Psi_n)\} - \varphi(\tau) \le E\{w(\tau,S'_n)\} - \varphi(\tau) + 2C_0\,p_n . \tag{5.53}$$

From (5.49) and (5.38), we have

$$E\{w(\tau,S'_n)\} - \varphi(\tau) \le 2C_1\, m\, n^{-1}(n+1)^{1/2} p_n^{-1/2} . \tag{5.54}$$

Next, by (3.28) and Lemma 1

$$E\{w(\tau,S'_{n-1})\} - \varphi(\tau) \ge E\{w(\tau,S'_{n-1}) - w(\tau,S'_n)\} \ge -\max_{i=1,\dots,m} E\Bigl|\frac{1}{n}\sum_{k=0}^{n} \tilde Y_k^{(i)}\Bigr| , \tag{5.55}$$

so that by (5.22) also

$$E\{w(\tau,S'_{n-1})\} - \varphi(\tau) \ge -2C_1\, m\, n^{-1}(n+1)^{1/2} p_n^{-1/2} . \tag{5.56}$$

Hence (5.56), (5.54) and (5.53) give the statement.

The theorem is proven.
Chapter VI
CONCLUDING REMARKS

In  this  last  chapter,  let  us  make  a  few  remarks  concerning  various 
assumptions  we  have  made,  possible  generalizations,  and  relationships 
to  other  problems. 

First, let us consider the rate of convergence established in Theorems 2, 3, and 5. It is easy to see that, for example, the choice $p_n = n^{-\alpha}$, $0 < \alpha < 1$, which satisfies both the conditions (5.1) and (5.2) and (5.44) and (5.45), yields the maximum rate $O(n^{-(1-r)/(1-3r)})$ under the assumptions of Theorem 2, that is, $O(n^{-1/4})$ in the most unfavorable case of $r = 3$, and the maximum rate $O(n^{-1/3})$ under the assumptions of Theorems 3 and 5. This is, of course, considerably slower than the typical rate $O(n^{-1/2})$ of Blackwell's and Hannan's rules and others derived from them.

Notice,  however,  that  the  strategic  rule  suggested  in  this  paper  does 
not  make  use  of  all  the  information  available  to  the  player  since  the 
information  obtained  during  active  plays  is  disregarded,  principally  for 
the  sake  of  simplifying  the  proofs.  It  is,  nevertheless,  conceivable 
that  if  the  disregarded  information  were  used,  the  rate  of  convergence 
might be improved. One way of doing this may be to record the loss during active plays as well and switch to another strategy as soon as the
accumulated  loss  for  the  strategy  being  used  decreases  under  the  next 
largest  value  of  accumulated  losses  recorded  before. 
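The exponents quoted for the choice $p_n = n^{-\alpha}$ can be reproduced by balancing two competing terms. The sketch below assumes, following the bounds in the proofs of Theorems 2 and 3, that the rate is governed by the larger of $\frac{1}{n}\sum_k p_k$, of order $n^{-\alpha}$, and $\frac{1}{n}\sum_k (kp_k)^{1/2r-1/2}$, of order $n^{-(1-\alpha)(r-1)/(2r)}$ (with $1/2r$ replaced by 0 under $(\mathcal{W};\infty)$). Equating the two exponents gives the best $\alpha$. The function names are hypothetical.

```python
from fractions import Fraction


def optimal_alpha(r=None):
    """Alpha equating alpha and (1 - alpha)(r - 1)/(2r); r=None stands for r = infinity."""
    if r is None:
        return Fraction(1, 3)
    r = Fraction(r)
    return (r - 1) / (3 * r - 1)


def rate_exponents(alpha, r=None):
    """Return the two decay exponents (test-play cost, estimation cost) for p_n = n**-alpha."""
    alpha = Fraction(alpha)
    beta = Fraction(1, 2) if r is None else Fraction(1, 2) - 1 / (2 * Fraction(r))
    return alpha, (1 - alpha) * beta


if __name__ == "__main__":
    for r in (3, 4, 10, None):
        a = optimal_alpha(r)
        print(r, a, rate_exponents(a, r))
```

For $r = 3$ both exponents equal $1/4$, and for $r = \infty$ both equal $1/3$, matching the rates $n^{-1/4}$ and $n^{-1/3}$ discussed above; any other $\alpha$ makes one of the two terms decay more slowly.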

As far as the assumptions of the theorems are concerned, the assumption $(\mathcal{W};r)$ was necessary for the proofs and can hardly be removed unless the method of proofs is changed considerably. The assumption (2.2) is merely a technical matter, as mentioned earlier. The remaining assumption (2.1) was introduced to keep the variance of the random variables $Y_n^{(i)}$ nonzero (see Lemmas 2 and 3). For the same reason, the vector $\eta$ [see (3.6)] appeared in the definition of the strategic rule (*). Some considerations indicate, however, that the same effect would be achieved without the assumption (2.1) if we replaced the vector $\eta$ in (3.6) by the random vector with $i$-th component ($i = 1,\dots,m$) equal to $\operatorname{sign}(W_n(\vartheta_n,a_i))$. Notice also that neither the assumption (2.1) nor the vector $\eta$ is needed to establish the truth of Theorems 4 and 5.
In Theorem 1, we have proven uniform convergence in probability. Naturally, the question arises whether a stronger convergence, i.e., convergence almost surely, can be proven. This may well be the case; we were, however, unable to do this here and the question remains open.

Next, let us discuss briefly the possibility of generalizing the results to infinite games (i.e., when the set $A$ is infinite). For $A$ denumerable, this might in principle be done by letting the random vectors $V_n$ take values in finite subsets $A_n \subset A$ (again with uniform distribution on the vertices of the simplex spanned on $A_n$) and then letting $m_n$, the cardinality of $A_n$, tend to infinity slowly enough. It is clear, however, that in this case the convergence in Theorems 1-5 may not be uniform unless the generic game possesses some properties (e.g., to be totally bounded in the sense of Wald's metric) that make it approximable by finite games. This is even more evident for the case in which $A$ is uncountable. Nevertheless, there is a large class of games with the above properties (e.g., games on the unit square, polynomial games, etc.) so that some investigation in this direction might be worthwhile.

We would also like to mention that the problem studied here is closely related to the so-called "two-armed bandit problem" (see, e.g., ref. [7]); in fact, it includes the latter as a degenerate case. The two-armed (or more generally $m$-armed) bandit problem can be briefly described as follows: Given are $m$ independent random experiments with outcomes 0 (success) and 1 (failure) having probabilities $1 - \pi_i$ and $\pi_i$; $i = 1,\dots,m$; respectively, where either the probabilities themselves are unknown or it is unknown to which of the $m$ experiments a particular pair $(\pi_i, 1 - \pi_i)$ belongs. These experiments are independently repeated; at each step only one of them is allowed to be performed. The problem is to find a rule for performing these experiments that would minimize the expected average number of failures. Clearly, no rule can do better than to achieve $\min\{\pi_1,\dots,\pi_m\}$, which is nothing but the value of the minimum functional $\varphi$ of the game with random loss function $(\Theta, A, \mathcal{W})$, where $\Theta$ is a one-element set $\{\vartheta\}$ and

$$W(\vartheta,a_i) = \begin{cases} 1 & \text{with probability } \pi_i \\ 0 & \text{with probability } 1 - \pi_i \end{cases} \qquad i = 1,\dots,m .$$
Thus our rule can be used for this problem and we conclude from Theorem 4 that the average number of failures converges almost surely to $\min\{\pi_1,\dots,\pi_m\}$ while Theorem 5 states that the expectation of a failure in the $n$-th experiment converges to $\min\{\pi_1,\dots,\pi_m\}$ with the rate $O(\max\{p_n,(n p_n)^{-1/2}\})$.
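A small simulation illustrates the bandit application. The sketch below is not the strategic rule (*) of Chapter III, which is not reproduced here; it is a simplified stand-in in the same spirit. With probability $p_n = n^{-1/3}$ a play is a test play on a uniformly chosen experiment, otherwise the experiment with the lowest failure frequency observed in test plays is used, and information from the other plays is disregarded. The failure probabilities, the seed, and all names are illustrative assumptions.

```python
import random


def average_failures(pi, n_plays, seed=0):
    """Average number of failures of a simplified test-and-exploit rule on an m-armed bandit."""
    rng = random.Random(seed)
    m = len(pi)
    fail_sum = [0] * m   # failures seen in test plays, per experiment
    count = [0] * m      # number of test plays, per experiment
    total_failures = 0
    for n in range(1, n_plays + 1):
        p_n = n ** (-1.0 / 3.0)          # test-play probability, decreasing to zero
        is_test = rng.random() < p_n
        if is_test:
            arm = rng.randrange(m)       # test play: uniform choice
        else:
            # Exploit: lowest observed failure frequency; untested arms count as 0.
            arm = min(range(m),
                      key=lambda i: fail_sum[i] / count[i] if count[i] else 0.0)
        loss = 1 if rng.random() < pi[arm] else 0
        total_failures += loss
        if is_test:                      # only test plays update the record
            fail_sum[arm] += loss
            count[arm] += 1
    return total_failures / n_plays


if __name__ == "__main__":
    print(average_failures([0.7, 0.2, 0.5], 20000, seed=1))
```

In this setting the printed average should lie somewhat above $\min_i \pi_i = 0.2$, the excess coming mostly from the test plays, whose share shrinks as $n$ grows.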

Finally, let us make a few remarks about the case when, instead of a sequential game against Nature, we consider a sequential game against an opponent (malicious or not); that is, if we allow the strategies $\vartheta_n$ of the first player to depend on the past strategies of the second player and on the losses incurred. This is the case studied recently by A. Banos [1] under the same assumptions about the information available to the second player as we have made and under nearly the same assumptions about the generic game, viz., $A$ finite and $(\mathcal{W};2)$. He succeeded in exhibiting a strategic rule for the player with the property that the average loss is, with probability one, asymptotically not greater than the value of the generic game (= maximum of the minimum functional). However, as can easily be seen, his strategic rule need not have the optimum property (5.3) even in the sequential game against Nature. On the other hand, our rule fails to have this optimum property in a sequential game against an opponent and, in general, does not even guarantee that the average loss will achieve the value of the game. This can be seen from the following simple example (due to T. M. Cover): Consider the generic game of "matching pennies," i.e., $A = \Theta = \{0,1\}$ and $W(\vartheta,a) = w(\vartheta,a) = |\vartheta - a|$ nonrandom, and suppose that the opponent decided to play, for each $n = 1,2,\dots$, $\vartheta_{n+1} = 1$ if $\Psi_n = 0$, and vice versa. Since, according to our rule, the player does not change his strategy between successive active plays and since the condition $p_n \downarrow 0$ implies that long runs of active plays will occur more and more often, the average loss incurred by the player will tend to 1 while the value of the game is 1/2. Thus the only rule known at this time which retains the optimum property (5.3) (in the a.s. sense) in both the sequential game against Nature and against an opponent is the rule of D. Blackwell [3]. The question remains open whether a similar rule exists even when the player's information is so severely limited as assumed in this paper.
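Cover's matching-pennies example can be simulated with a simplified stand-in for the rule (again not the rule (*) itself). The player makes a test play with probability $p_n = n^{-1/3}$ and otherwise repeats the strategy that looks best from test plays; the opponent always plays the opposite of the player's previous move, so that with loss $|\vartheta - a|$ every repeated move costs 1. The initial move, the seed, and the function name are assumptions of this illustration.

```python
import random


def matching_pennies_average_loss(n_plays, seed=0):
    """Average loss of a test-and-exploit player against an opponent who counters the last move."""
    rng = random.Random(seed)
    test_loss = [0, 0]    # losses observed in test plays, per strategy
    test_count = [0, 0]   # number of test plays, per strategy
    prev_move = 0         # hypothetical initial move of the player
    total = 0
    for n in range(1, n_plays + 1):
        theta = 1 - prev_move                       # opponent: opposite of player's previous move
        is_test = rng.random() < n ** (-1.0 / 3.0)  # test-play probability p_n
        if is_test:
            move = rng.randrange(2)
        else:
            means = [test_loss[a] / test_count[a] if test_count[a] else 0.0
                     for a in (0, 1)]
            move = 0 if means[0] <= means[1] else 1  # unchanged between test plays
        loss = abs(theta - move)                     # loss |theta - a|
        total += loss
        if is_test:
            test_loss[move] += loss
            test_count[move] += 1
        prev_move = move
    return total / n_plays


if __name__ == "__main__":
    print(matching_pennies_average_loss(20000, seed=0))
```

Because the player's move can change only at a test play or immediately after one, the fraction of plays with zero loss is at most about twice the fraction of test plays, so the average loss approaches 1 rather than the game value 1/2.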
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Joint  Services  Electronics  Programs 
(U.S.  Army,  U.S.  Navy,  U.S.  Air  Force) 

